-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add option for warming reads to mirror primary read queries onto replicas from vtgates to warm bufferpools #13206
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
If a new flag is being introduced:
If a workflow is added or modified:
Bug fixes
Non-trivial changes
New/Existing features
Backward compatibility
|
@olyazavr this is really cool! 🎉 I wonder if this could more "observable". Is it possible to add Assuming |
@timvaillancourt : metrics in the vtgate about warm-up might generate a large metric volume, which makes me think it is not the best place to have this observability. What would you think of putting these metrics on the vttablet ? |
@jfg956 that's a good idea that has given me a few more ideas 👍 I think a single I think
Last, a deferrable new question: Is there an upper limit to how many queries |
@timvaillancourt those are all really good ideas! I'm waiting to get some sort of approval or comment from Vitess team that this is even a feature they would consider merging before I go and implement more parts of this |
@olyazavr makes sense! They're all deferrable ideas, don't let them block things 👍. Let me know if I can help with anything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @olyazavr this is awesome, and something I am excited to see land. I work at PlanetScale, my review does not carry the power of approval. I think you can continue to hold off on making changes until someone from the Vitess eng. team chimes in. That said, left some questions and small suggestions.
At a higher-level, had some bigger questions:
- From your experience using this in production, what CPU/memory impact have you observed on VTGates? What about overall performance?
- I think it might be nice to have the option to change this value at runtime. If setting 1-2% for warming has a noticeable impact on overall performance, this might be something we only want to set during a maintenance window prior to PRS. No idea how much work it would be to incorporate structural support for this in your PR. If it's a lot I imagine it could be added later on.
I know that the Vitess eng. team will eventually request an addition to changelog/
and a website PR, so noting that as well. I also think adding or modifying existing E2E tests would be a good idea and something they will likely request.
go/vt/vtgate/engine/route.go
Outdated
return | ||
} | ||
|
||
go func(replicaVCursor VCursor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to the idea in the main thread to have a warming pool or something else that caps the inflight warming requests.
go/vt/vtgate/executor_select_test.go
Outdated
executor.normalize = true | ||
session := NewSafeSession(&vtgatepb.Session{TargetString: KsTestUnsharded}) | ||
|
||
_, err := executor.Execute(context.Background(), "TestSelect", session, "select age, city from user", map[string]*querypb.BindVariable{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like missing mysqlCtx
argument between the context and the method.
go/vt/vtgate/vcursor_impl.go
Outdated
callerId := callerid.EffectiveCallerIDFromContext(ctx) | ||
immediateCallerId := callerid.ImmediateCallerIDFromContext(ctx) | ||
|
||
timedCtx, _ := context.WithTimeout(context.Background(), 5*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to pull this 5*time.Second
up into a constant, make it configurable by flag, or maybe default to --query-timeout
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, not sure if this is something we could/woud want to use:
vitess/go/vt/vtgate/engine/route.go
Line 186 in 344280a
func addQueryTimeout(ctx context.Context, vcursor VCursor, queryTimeout int) (context.Context, context.CancelFunc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or this?
vitess/go/vt/vtgate/safe_session.go
Line 206 in aab7c89
func (session *SafeSession) SetQueryTimeout(queryTimeout int64) { |
go/vt/vtgate/executor_select_test.go
Outdated
executor.normalize = true | ||
session := NewSafeSession(&vtgatepb.Session{TargetString: KsTestUnsharded}) | ||
|
||
_, err := executor.Execute(context.Background(), "TestSelect", session, "select age, city from user", map[string]*querypb.BindVariable{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to modify this test so that it runs through a bunch of different cases which exercise the various case statements you have in executeWarmingReplicaRead
, and also some tests for negatives like insert
, update
, and queries that compose select
with insert
and update
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also be good to add tests with trailing comments to see how that lands on the replicas.
go/vt/vtgate/executor_select_test.go
Outdated
@@ -1498,7 +1498,7 @@ func TestStreamSelectIN(t *testing.T) { | |||
} | |||
|
|||
func createExecutor(serv *sandboxTopo, cell string, resolver *Resolver) *Executor { | |||
return NewExecutor(context.Background(), serv, cell, resolver, false, false, testBufferSize, cache.DefaultConfig, nil, false, querypb.ExecuteOptions_V3) | |||
return NewExecutor(context.Background(), serv, cell, resolver, false, false, testBufferSize, cache.DefaultConfig, nil, false, querypb.ExecuteOptions_V3, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query serving team may disagree to me, but might be good to set this to a number greater than 0 to help shake out any issues over the next release cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would create nondeterministic replica queries, which can make for flakey tests (we encountered this problem)
go/vt/vtgate/executor_select_test.go
Outdated
@@ -3025,7 +3025,7 @@ func TestStreamOrderByLimitWithMultipleResults(t *testing.T) { | |||
count++ | |||
} | |||
|
|||
executor := NewExecutor(context.Background(), serv, cell, resolver, true, false, testBufferSize, cache.DefaultConfig, nil, false, querypb.ExecuteOptions_V3) | |||
executor := NewExecutor(context.Background(), serv, cell, resolver, true, false, testBufferSize, cache.DefaultConfig, nil, false, querypb.ExecuteOptions_V3, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thought here and everywhere else.
go/vt/vtgate/engine/primitive.go
Outdated
GetWarmingReadsPercent() int | ||
|
||
// CloneForReplicaWarming clones the VCursor for re-use in warming queries to replicas | ||
CloneForReplicaWarming(ctx context.Context) interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return interface{}
here instead of VCursor
?
go/vt/vtgate/engine/route.go
Outdated
return | ||
} | ||
|
||
_, _ = replicaVCursor.ExecuteMultiShard(ctx, route, rss, queries, false /* rollbackOnError */, false /* autocommit */) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add some stats to add visibility into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see in the main conversation now that there's talk about getting this at the tablet level. FWIW I think it could be useful here in case for whatever reason the queries fail to reach the tablets.
Good point. I think the viper work is actually ready, if you want to use that to make this dynamically configurable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of warming reads!
go/vt/vtgate/engine/route.go
Outdated
|
||
func (route *Route) executeWarmingReplicaRead(ctx context.Context, vcursor VCursor, bindVars map[string]*querypb.BindVariable, queries []*querypb.BoundQuery) { | ||
switch route.Opcode { | ||
case Unsharded, Scatter, Equal, EqualUnique, IN: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MultiEqual might also be something to consider in the opcodes to allow.
This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:
If no action is taken within 7 days, this PR will be closed. |
This PR was closed because it has been stale for 7 days with no activity. |
Signed-off-by: Olga Shestopalova <[email protected]>
Signed-off-by: Olga Shestopalova <[email protected]>
Signed-off-by: Olga Shestopalova <[email protected]>
Signed-off-by: Olga Shestopalova <[email protected]>
d56d434
to
c8576ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to track down one of the unit test failures to this. Because we aren't passing in the context that we cancel later, instead we pass in context.Background
this causes the topo watchers to not shutdown properly, which eventually show up as leaked golang threads -
noleak.go:56: found unexpected goroutines:
[Goroutine 50 in state chan receive, with vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch.func1 on top of the stack:
goroutine 50 [chan receive]:
vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch.func1()
/Users/manangupta/vitess/go/vt/topo/memorytopo/watch.go:56 +0x5c
created by vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch in goroutine 15
/Users/manangupta/vitess/go/vt/topo/memorytopo/watch.go:55 +0x250
Goroutine 51 in state chan receive, with vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema.func1 on top of the stack:
goroutine 51 [chan receive]:
vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema.func1()
/Users/manangupta/vitess/go/vt/topo/srv_vschema.go:74 +0x9c
created by vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema in goroutine 15
/Users/manangupta/vitess/go/vt/topo/srv_vschema.go:70 +0x14c
Goroutine 52 in state select, with vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema.func1 on top of the stack:
goroutine 52 [select]:
vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema.func1()
/Users/manangupta/vitess/go/vt/vtgate/sandbox_test.go:323 +0x94
created by vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema in goroutine 15
/Users/manangupta/vitess/go/vt/vtgate/sandbox_test.go:321 +0x16c
Goroutine 59 in state chan receive, with vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch.func1 on top of the stack:
goroutine 59 [chan receive]:
vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch.func1()
/Users/manangupta/vitess/go/vt/topo/memorytopo/watch.go:56 +0x5c
created by vitess.io/vitess/go/vt/topo/memorytopo.(*Conn).Watch in goroutine 15
/Users/manangupta/vitess/go/vt/topo/memorytopo/watch.go:55 +0x250
Goroutine 60 in state chan receive, with vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema.func1 on top of the stack:
goroutine 60 [chan receive]:
vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema.func1()
/Users/manangupta/vitess/go/vt/topo/srv_vschema.go:74 +0x9c
created by vitess.io/vitess/go/vt/topo.(*Server).WatchSrvVSchema in goroutine 15
/Users/manangupta/vitess/go/vt/topo/srv_vschema.go:70 +0x14c
Goroutine 61 in state select, with vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema.func1 on top of the stack:
goroutine 61 [select]:
vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema.func1()
/Users/manangupta/vitess/go/vt/vtgate/sandbox_test.go:323 +0x94
created by vitess.io/vitess/go/vt/vtgate.(*sandboxTopo).WatchSrvVSchema in goroutine 15
/Users/manangupta/vitess/go/vt/vtgate/sandbox_test.go:321 +0x16c
]
I would have committed these changes directly, but unfortunately I don't have the access to do that 😢 |
utils.MustMatch(t, wantQueriesReplica, replica.Queries) | ||
replica.Queries = nil | ||
|
||
_, err = executor.Execute(ctx, nil, "TestSelect", session, "insert into user (age, city) values (5, 'Boston')", map[string]*querypb.BindVariable{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_, err = executor.Execute(ctx, nil, "TestSelect", session, "insert into user (age, city) values (5, 'Boston')", map[string]*querypb.BindVariable{}) | |
_, err = executor.Execute(ctx, nil, "TestWarmingReads", session, "insert into user (age, city) values (5, 'Boston')", map[string]*querypb.BindVariable{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestHelpOutput is also failing -
--- FAIL: TestHelpOutput (0.70s)
--- FAIL: TestHelpOutput/vtcombo (0.06s)
flags_test.go:142: []: (-want +got)
(
"""
... // 421 identical lines
--vttablet_skip_buildinfo_tags string comma-separated list of buildinfo tags to skip from merging with --init_tags. each tag is either an exact match or a regular expression of the form '/regexp/'. (default "/.*/")
--wait_for_backup_interval duration (init restore parameter) if this is greater than 0, instead of starting up empty when no backups are found, keep checking at this interval for a backup to appear
+ --warming-reads-percent int Percentage of reads on the primary to forward to replicas. Useful for keeping buffer pools warm (default 0)
+ --warming-reads-pool-size int Size of goroutine pool for warming reads (default 500) (default 500)
+ --warming-reads-query-timeout duration Timeout of warming read queries (default 5s) (default 5s)
--warn_memory_rows int Warning threshold for in-memory results. A row count higher than this amount will cause the VtGateWarnings.ResultsExceeded counter to be incremented. (default 30000)
--warn_payload_size int The warning threshold for query payloads in bytes. A payload greater than this threshold will cause the VtGateWarnings.WarnPayloadSizeExceeded counter to be incremented.
... // 11 identical lines
"""
)
--- FAIL: TestHelpOutput/vtgate (0.10s)
flags_test.go:142: []: (-want +got)
strings.Join({
... // 32472 identical bytes
"\n --warming-reads-pool-size int ",
" Size of goroutine pool for warming reads (default 500)",
+ " (default 500)",
"\n --warming-reads-query-timeout duration ",
" Timeout of warming read queri",
+ "es (d",
"e",
+ "fault 5",
"s",
+ ")",
" (default 5s)\n --warn_memory_rows int ",
" Warning threshold for in-memory results. ",
... // 563 identical bytes
}, "")
FAIL
This requires fixing the vtgate.txt and vtcombo.txt file in the flags/endtoend
directory to match the new expectation.
Co-authored-by: Manan Gupta <[email protected]> Signed-off-by: Olga Shestopalova <[email protected]>
Signed-off-by: Olga Shestopalova <[email protected]>
go/flags/endtoend/vtcombo.txt
Outdated
@@ -421,6 +421,9 @@ Flags: | |||
--vtgate_grpc_server_name string the server name to use to validate server certificate | |||
--vttablet_skip_buildinfo_tags string comma-separated list of buildinfo tags to skip from merging with --init_tags. each tag is either an exact match or a regular expression of the form '/regexp/'. (default "/.*/") | |||
--wait_for_backup_interval duration (init restore parameter) if this is greater than 0, instead of starting up empty when no backups are found, keep checking at this interval for a backup to appear | |||
--warming-reads-percent int Percentage of reads on the primary to forward to replicas. Useful for keeping buffer pools warm (default 0) | |||
--warming-reads-pool-size int Size of goroutine pool for warming reads (default 500) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Vitess, pools usually mean connection pools. When I read the description I thought it's a waitGroup or something like that. Actually it turns out to be a channel bool
whose capacity is set by this flag.
The flag name and description are misleading. They need to be changed to reflect the actual usage. It should be something like --warming-reads-concurrency
and be documented as Number of concurrent warming reads allowed
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I edited the PR description at the top to list all 3 flags. Once these changes are made, that needs to change again.
Signed-off-by: Olga Shestopalova <[email protected]>
Signed-off-by: Olga Shestopalova <[email protected]>
So close. flags_test is still failing. you can fix that after addressing the feedback and then this will be |
@ajm188 given https://vitess.slack.com/archives/CMDJ2KFEZ/p1695982611699689 can we say that this PR does not need a separate website docs PR to update the flags? |
Yep that's correct! |
Signed-off-by: Olga Shestopalova <[email protected]>
8d4030b
to
498d1dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
What else is needed for approval here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Thank you for the contribution!
Description
When reparenting to a replica, if that replica has recently been restarted, it will have a cold bufferpool, and for bufferpool-reliant workloads, this means a performance hit for a few minutes until the bufferpool of the new primary warms up.
As such, it would be great to have the ability to mirror a certain percentage of SELECTs from the current primary to the replicas at the vtgate level, so that when time comes to reparent, the replicas will have a warmer bufferpool than before and not suffer this consequence.
This adds three vtgate flags that can be used to enable and control mirroring a percentage of SELECTs from the current primary to replicas.
We've been running with this feature at HubSpot for years now and it has significantly improved performance during reparents/rolling restarts. Previously, rolling servers to release a fix/feature was risky and could impact running apps and customers, but now it's something invisible, largely because we are no longer reparenting to a replica with a cold bufferpool
Related Issue(s)
Fixes #13205
Checklist
Deployment Notes